CANF-VC++: Enhancing Conditional Augmented Normalizing Flows for Video Compression with Advanced Techniques
Video has become the predominant medium for information dissemination,
driving the need for efficient video codecs. Recent advancements in learned
video compression have shown promising results, surpassing traditional codecs
in terms of coding efficiency. However, challenges remain in integrating
fragmented techniques and incorporating new tools into existing codecs. In this
paper, we comprehensively review the state-of-the-art CANF-VC codec and propose
CANF-VC++, an enhanced version that addresses these challenges. We
systematically explore architecture design, reference frame type, training
procedure, and entropy coding efficiency, leading to substantial coding
improvements. CANF-VC++ achieves significant Bjøntegaard-Delta rate savings
on conventional datasets UVG, HEVC Class B and MCL-JCV, outperforming the
baseline CANF-VC and even the H.266 reference software VTM. Our work
demonstrates the potential of integrating advancements in video compression and
serves as inspiration for future research in the field.
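For readers unfamiliar with the Bjøntegaard-Delta rate (BD-rate) metric cited in this abstract, the following is a minimal sketch of the commonly used cubic-fit formulation: fit log-bitrate as a cubic polynomial of PSNR for each codec, integrate both fits over the overlapping PSNR range, and convert the average log-rate gap to a percentage. The function name `bd_rate` and the sample numbers are illustrative assumptions, not taken from the paper, and the paper's own evaluation procedure may differ (for example, in interpolation method or quality metric).

```python
import numpy as np

def bd_rate(anchor_rates, anchor_psnr, test_rates, test_psnr):
    """Average bitrate difference (%) of test vs. anchor at equal PSNR.

    Negative values mean the test codec needs fewer bits than the anchor.
    Hypothetical helper; uses the cubic polynomial fit variant of BD-rate.
    """
    log_a = np.log(np.asarray(anchor_rates, dtype=float))
    log_t = np.log(np.asarray(test_rates, dtype=float))
    psnr_a = np.asarray(anchor_psnr, dtype=float)
    psnr_t = np.asarray(test_psnr, dtype=float)

    # Integrate only over the PSNR interval covered by both curves.
    lo = max(psnr_a.min(), psnr_t.min())
    hi = min(psnr_a.max(), psnr_t.max())

    # Centering the PSNR axis does not change the fitted curve,
    # but it keeps the polynomial fit well conditioned.
    center = 0.5 * (lo + hi)
    fit_a = np.polyfit(psnr_a - center, log_a, 3)
    fit_t = np.polyfit(psnr_t - center, log_t, 3)

    # Mean log-rate of each fitted curve over [lo, hi].
    int_a = np.polyint(fit_a)
    int_t = np.polyint(fit_t)
    a, b = lo - center, hi - center
    avg_a = (np.polyval(int_a, b) - np.polyval(int_a, a)) / (hi - lo)
    avg_t = (np.polyval(int_t, b) - np.polyval(int_t, a)) / (hi - lo)

    return (np.exp(avg_t - avg_a) - 1.0) * 100.0

# Illustrative (made-up) rate-distortion points: the test codec spends
# 10% fewer bits than the anchor at every PSNR level.
anchor_rates = [1000.0, 2000.0, 4000.0, 8000.0]   # kbps
psnr_points = [32.0, 35.0, 38.0, 41.0]            # dB
test_rates = [0.9 * r for r in anchor_rates]
saving = bd_rate(anchor_rates, psnr_points, test_rates, psnr_points)
```

With these sample points the result is a BD-rate of about -10%, matching the uniform 10% bitrate reduction built into the test data.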
Transformer-based Image Compression with Variable Image Quality Objectives
This paper presents a Transformer-based image compression system that allows
for a variable image quality objective according to the user's preference.
Optimizing a learned codec for different quality objectives leads to
reconstructed images with varying visual characteristics. Our method provides
the user with the flexibility to choose a trade-off between two image quality
objectives using a single, shared model. Motivated by the success of
prompt-tuning techniques, we introduce prompt tokens to condition our
Transformer-based autoencoder. These prompt tokens are generated adaptively
based on the user's preference and input image through learning a prompt
generation network. Extensive experiments on commonly used quality metrics
demonstrate the effectiveness of our method in adapting the encoding and/or
decoding processes to a variable quality objective. While offering the
additional flexibility, our proposed method performs comparably to the
single-objective methods in terms of rate-distortion performance.
Transformer-based Variable-rate Image Compression with Region-of-interest Control
This paper proposes a transformer-based learned image compression system. It
is capable of achieving variable-rate compression with a single model while
supporting the region-of-interest (ROI) functionality. Inspired by prompt
tuning, we introduce prompt generation networks to condition the
transformer-based autoencoder of compression. Our prompt generation networks
generate content-adaptive tokens according to the input image, an ROI mask, and
a rate parameter. The separation of the ROI mask and the rate parameter allows
an intuitive way to achieve variable-rate and ROI coding simultaneously.
Extensive experiments validate the effectiveness of our proposed method and
confirm its superiority over the other competing methods.
Comment: Accepted to IEEE ICIP 202